Graph Neural Networks (GNNs) have become increasingly important in recent years due to their state-of-the-art performance on many important downstream applications. Existing GNNs have mostly focused on learning a single representation per node, even though a node often exhibits polysemous behavior in different contexts. In this work, we develop a persona-based graph neural network framework called PersonaSAGE that learns multiple persona-based embeddings for each node in the graph. Such disentangled representations are more interpretable and useful than a single embedding. Furthermore, PersonaSAGE learns the appropriate set of persona embeddings for each node, so different nodes can be assigned different numbers of persona embeddings. The framework is flexible, and its general design lets the learned embeddings be adapted to the domain at hand. We evaluate our approach on publicly available benchmark datasets against a variety of baselines. The experiments demonstrate the effectiveness of PersonaSAGE on a variety of important tasks, including link prediction, where we achieve an average gain of 15%, while remaining competitive for node classification. Finally, we demonstrate the utility of PersonaSAGE with a case study on personalized recommendation of different entity types in a data management platform.
The booming development and huge market of micro-videos open new e-commerce channels for merchants. A growing number of micro-video publishers now embed relevant ads in their micro-videos, which not only provides them with business income but also helps audiences discover products that interest them. However, because micro-videos are recorded with unprofessional equipment, cover various topics, and include multiple modalities, it is challenging to locate the products related to a micro-video efficiently, appropriately, and accurately. We formulate the microvideo-product retrieval task, which is the first attempt to explore retrieval in which both the query and the target are multi-modal instances. We propose a novel approach for bidirectional retrieval, the Multi-Queue Momentum Contrast (MQMC) network, which consists of uni-modal feature learning and multi-modal instance representation learning. Moreover, a discriminative selection strategy with a multi-queue is used to distinguish the importance of different negatives based on their categories. We collect two large-scale microvideo-product datasets (MVS and MVS-large) for evaluation and manually construct a hierarchical category ontology that covers sundry products in daily life. Extensive experiments show that MQMC outperforms the state-of-the-art baselines. Our replication package (including code, dataset, etc.) is publicly available at https://github.com/duyali2000/MQMC.
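To make the idea of category-aware negatives more concrete, here is a minimal sketch of a contrastive loss whose negatives come from per-category queues and are weighted by category. It is not the MQMC implementation: the class `MultiQueue`, the function `weighted_info_nce`, the per-category `weights` mapping, and the temperature value are all assumptions introduced for illustration, and the abstract does not say how the importance of each negative is actually computed.

```python
import math
from collections import deque
from typing import Deque, Dict, List, Sequence, Tuple

Vector = Sequence[float]


class MultiQueue:
    """Hypothetical store of negatives: one fixed-size FIFO queue per category."""

    def __init__(self, size: int) -> None:
        self.size = size
        self.queues: Dict[str, Deque[Vector]] = {}

    def push(self, category: str, embedding: Vector) -> None:
        # The oldest embedding of a category is evicted once its queue is full.
        queue = self.queues.setdefault(category, deque(maxlen=self.size))
        queue.append(embedding)

    def negatives(self) -> List[Tuple[str, Vector]]:
        return [(cat, emb) for cat, queue in self.queues.items() for emb in queue]


def dot(u: Vector, v: Vector) -> float:
    return sum(a * b for a, b in zip(u, v))


def weighted_info_nce(query: Vector, positive: Vector,
                      negatives: List[Tuple[str, Vector]],
                      weights: Dict[str, float], tau: float = 1.0) -> float:
    """InfoNCE-style loss where each negative is scaled by its category weight.

    Categories missing from `weights` default to 1.0 (assumed behavior).
    """
    pos = math.exp(dot(query, positive) / tau)
    neg = sum(weights.get(cat, 1.0) * math.exp(dot(query, emb) / tau)
              for cat, emb in negatives)
    return -math.log(pos / (pos + neg))


queue = MultiQueue(size=2)
queue.push("shoes", [0.0, 1.0])
queue.push("shoes", [0.6, 0.8])
queue.push("bags", [-1.0, 0.0])
loss = weighted_info_nce([1.0, 0.0], [0.8, 0.6], queue.negatives(), {"shoes": 2.0})
```

In this sketch a larger weight makes a category's negatives count more in the denominator, so the loss pushes the query away from them more strongly. Whether hard negatives should be up-weighted or down-weighted is a design choice left open here.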
Proper functioning of connected and automated vehicles (CAVs) is crucial for the safety and efficiency of future intelligent transport systems. Meanwhile, transitioning to fully autonomous driving requires a long period of mixed autonomy traffic, including both CAVs and human-driven vehicles. Thus, collaborative decision-making for CAVs is essential to generate appropriate driving behaviors that enhance the safety and efficiency of mixed autonomy traffic. In recent years, deep reinforcement learning (DRL) has been widely used to solve decision-making problems. However, existing DRL-based methods have mainly focused on the decision-making of a single CAV. When applied to mixed autonomy traffic, they cannot accurately represent the mutual effects of vehicles or model dynamic traffic environments. To address these shortcomings, this article proposes a graph reinforcement learning (GRL) approach for multi-agent decision-making of CAVs in mixed autonomy traffic. First, a generic and modular GRL framework is designed. Then, a systematic review of DRL and GRL methods is presented, focusing on the problems addressed in recent research. Moreover, a comparative study of different GRL methods is conducted on the designed framework to verify their effectiveness. Results show that, compared to the DRL methods, the GRL methods better optimize the performance of multi-agent decision-making for CAVs in mixed autonomy traffic. Finally, challenges and future research directions are summarized. This study can provide a valuable research reference for solving the multi-agent decision-making problems of CAVs in mixed autonomy traffic and can promote the implementation of GRL-based methods in intelligent transportation systems. The source code of our work can be found at https://github.com/Jacklinkk/Graph_CAVs.
Fact verification has attracted a lot of research attention recently, e.g., in journalism, marketing, and policymaking, as misinformation and disinformation online can sway one's opinion and affect one's actions. While fact-checking is a hard task in general, in many cases, false statements can be easily debunked based on analytics over tables with reliable information. Hence, table-based fact verification has recently emerged as an important and growing research area. Yet, progress has been limited due to the lack of datasets that can be used to pre-train language models (LMs) to be aware of common table operations, such as aggregating a column or comparing tuples. To bridge this gap, in this paper we introduce PASTA, a novel state-of-the-art framework for table-based fact verification via pre-training with synthesized sentence-table cloze questions. In particular, we design six types of common sentence-table cloze tasks, including Filter, Aggregation, Superlative, Comparative, Ordinal, and Unique, based on which we synthesize a large corpus consisting of 1.2 million sentence-table pairs from WikiTables. PASTA uses a recent pre-trained LM, DeBERTaV3, and further pretrains it on our corpus. Our experimental results show that PASTA achieves new state-of-the-art performance on two table-based fact verification benchmarks: TabFact and SEM-TAB-FACTS. In particular, on the complex set of TabFact, which contains multiple operations, PASTA largely outperforms the previous state of the art by 4.7 points (85.6% vs. 80.9%), and the gap between PASTA and human performance on the small TabFact test set is narrowed to just 1.5 points (90.6% vs. 92.1%).
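As a rough illustration of what a synthesized sentence-table cloze question might look like, the sketch below builds two of the six task types named above (Aggregation and Superlative) from a toy table. The sentence templates, the `[MASK]` token, and the function names `aggregation_cloze` and `superlative_cloze` are assumptions made for this example. They are not taken from the PASTA code or corpus.

```python
from typing import Dict, List, Tuple, Union

Cell = Union[str, int, float]
Table = List[Dict[str, Cell]]


def aggregation_cloze(table: Table, column: str) -> Tuple[str, str]:
    """Return a masked sentence about a column sum and the masked answer."""
    total = sum(row[column] for row in table)
    return f"The sum of {column} is [MASK].", str(total)


def superlative_cloze(table: Table, key_column: str,
                      value_column: str) -> Tuple[str, str]:
    """Return a masked sentence asking which row has the highest value."""
    best = max(table, key=lambda row: row[value_column])
    sentence = f"The {key_column} with the highest {value_column} is [MASK]."
    return sentence, str(best[key_column])


# Toy table invented for this example.
toy_table: Table = [
    {"team": "A", "points": 10},
    {"team": "B", "points": 25},
    {"team": "C", "points": 7},
]

agg_sentence, agg_answer = aggregation_cloze(toy_table, "points")
sup_sentence, sup_answer = superlative_cloze(toy_table, "team", "points")
```

A language model pre-trained on such pairs has to read the table to fill in the mask, which is the kind of table-operation awareness the abstract describes.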
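Graph contrastive learning (GCL) has been an emerging solution for graph self-supervised learning. The core principle of GCL is to reduce the distance between samples in the positive view while increasing the distance between samples in the negative view. While achieving promising performance, current GCL methods still suffer from two limitations: (1) the uncontrollable validity of augmentation, in that graph perturbation may produce invalid views that violate the semantics and feature topology of graph data; and (2) unreliable binary contrastive justification, in that the positiveness and negativeness of the constructed views are difficult to determine for non-Euclidean graph data. To tackle these limitations, we propose a new contrastive learning paradigm, Graph Soft-Contrastive Learning (GSCL), which conducts contrastive learning at a finer granularity by ranking neighborhoods, without any augmentation or binary contrastive justification. GSCL is built on the fundamental assumption of graph proximity, namely that connected neighbors are more similar than distant nodes. Specifically, we develop pair-wise and list-wise gated ranking losses to preserve the relative ranking relationships in the neighborhood. Moreover, because the neighborhood size grows exponentially as more hops are considered, we propose neighborhood sampling strategies to improve learning efficiency. Extensive experimental results show that the proposed GSCL consistently achieves state-of-the-art performance on various public datasets, with practical complexity comparable to that of GCL.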
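The proximity assumption (a closer neighbor should be more similar to the anchor than a farther one) can be sketched as a plain pair-wise logistic ranking loss. This is a generic ranking loss used for illustration, not the gated ranking loss of GSCL. The function names, the cosine similarity choice, and the temperature `tau` are assumptions.

```python
import math
from typing import Sequence

Vector = Sequence[float]


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def pairwise_rank_loss(anchor: Vector, near: Vector, far: Vector,
                       tau: float = 0.5) -> float:
    """Logistic ranking loss: small when `near` (e.g. a 1-hop neighbor) is
    more similar to `anchor` than `far` (e.g. a 2-hop neighbor)."""
    margin = (cosine(anchor, near) - cosine(anchor, far)) / tau
    return math.log1p(math.exp(-margin))


anchor = [1.0, 0.0]
one_hop = [0.9, 0.1]   # embedding of a directly connected node (toy values)
two_hop = [0.0, 1.0]   # embedding of a node two hops away (toy values)

respects_ranking = pairwise_rank_loss(anchor, one_hop, two_hop)
violates_ranking = pairwise_rank_loss(anchor, two_hop, one_hop)
```

Here `respects_ranking` is smaller than `violates_ranking`, so minimizing the loss pushes embeddings toward the hop-based ordering without labeling any pair as strictly positive or negative.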
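In this paper, we investigate a new topic, object-effects recommendation in micro-video platforms, which is a challenging but important task for many practical applications such as advertisement insertion. To avoid the background bias introduced by learning video content directly from image frames, we propose to exploit the meaningful body language hidden in 3D human poses for recommendation. To this end, we introduce a novel human-pose-driven object-effects recommendation network called PoseRec. PoseRec leverages the advantages of 3D human pose detection and learns information from multi-frame 3D human poses for video-item registration, resulting in high-quality object-effects recommendation. Moreover, to address the inherent ambiguity and sparsity issues in object-effects recommendation, we further propose a novel item-aware implicit prototype learning module and a novel pose-aware transductive hard-negative mining module to better learn pose-item relationships. Furthermore, to benchmark methods for this new research topic, we build a new dataset for object-effects recommendation named Pose-OBE. Extensive experiments on Pose-OBE show that our method achieves better performance than strong baselines.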
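Effectively preserving and encoding the structural features of objects from irregular and sparse LiDAR points is a key challenge for 3D object detection on point clouds. Recently, Transformers have shown promising performance on many 2D and even 3D vision tasks. Compared with fixed and rigid convolution kernels, the self-attention mechanism in a Transformer can adaptively exclude unrelated or noisy points and is therefore suitable for preserving the local spatial structure in irregular LiDAR point clouds. However, a Transformer only performs a simple weighted sum over point features based on the self-attention mechanism, and all points share the same value transformation. Such an isotropic operation lacks the ability to capture the direction- and distance-oriented local structure that is important for 3D object detection. In this work, we propose a Structure-Embedding transFormer (SEFormer), which not only preserves local structure as a traditional Transformer does but can also encode it. In contrast to the self-attention mechanism in a traditional Transformer, SEFormer learns different feature transformations for value points based on their relative directions and distances to the query point. We then propose a SEFormer-based network for high-performance 3D object detection. Extensive experiments show that the proposed architecture achieves SOTA results on the Waymo Open Dataset, the largest 3D detection benchmark for autonomous driving. Specifically, SEFormer achieves 79.02% mAP, which is 1.2% higher than existing works. We will release the code.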
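In this work, we explore data augmentation for knowledge distillation in semantic segmentation. To avoid over-fitting to the noise in the teacher network, a large number of training examples is essential for knowledge distillation. Image-level augmentation techniques such as flipping, translation, or rotation are widely used in previous knowledge distillation frameworks. Inspired by recent progress on semantic directions in feature space, we propose to include augmentations in feature space for efficient distillation. Specifically, given a semantic direction, an infinite number of augmentations can be obtained for the student in feature space. Furthermore, our analysis shows that these augmentations can be optimized simultaneously by minimizing an upper bound on the augmentation losses. Based on this observation, we develop a new algorithm for knowledge distillation in semantic segmentation. Extensive experiments on four semantic segmentation benchmarks show that the proposed method improves the performance of current knowledge distillation methods without any significant overhead. Code is available at: https://github.com/jianlong-yuan/fakd.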
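Bundle recommender systems recommend sets of items (e.g., pants, shirts, and shoes) to users, but they often suffer from two issues: significant interaction sparsity and a large output space. In this work, we extend multi-round conversational recommendation (MCR) to alleviate these issues. MCR uses a conversational paradigm to elicit user interests by asking about user preferences on tags (e.g., categories or attributes) and handling user feedback across multiple rounds. It is an emerging recommendation setting for acquiring user feedback and narrowing down the output space, but it has not been explored in the context of bundle recommendation. In this work, we propose a novel recommendation task named Bundle MCR. We first propose a new framework that formulates Bundle MCR as a Markov decision process (MDP) with multiple agents for user modeling, consultation, and feedback handling. Under this framework, we propose a model, Bunt, to (1) recommend items, (2) ask questions, and (3) manage conversations based on bundle-aware conversation states. Moreover, to train Bunt effectively, we propose a two-stage training strategy. In the offline pre-training stage, Bunt is trained with multiple cloze tasks to mimic bundle interactions in conversations. Then, in the online fine-tuning stage, the Bunt agents are enhanced through user interactions. Our experiments on multiple offline datasets, together with a human evaluation, show the value of extending the MCR framework to the bundle setting and the effectiveness of our Bunt design.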
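Existing class-incremental lifelong learning studies use only single-label data, which limits their adaptability to multi-label data. This paper studies lifelong multi-label (LML) classification, which builds an online class-incremental classifier over a sequential multi-label classification data stream. Training with partially labeled data in LML classification may lead to more severe catastrophic forgetting of old classes. To address this problem, the study proposes an Augmented Graph Convolutional Network (AGCN) with an Augmented Correlation Matrix (ACM) built across sequential partial-label tasks. Results on two benchmarks show that the method is effective for LML classification and for reducing forgetting.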